1.  Introduction

In a recent technical paper [4] a method of forecasting the size of a
force (military organization) subject to random withdrawals was provided.
The method presented there considers the records on the size of the force
at six-month intervals, estimates the retention rates of certain subgroups
(cohorts) at the time of forecasting and provides a prediction interval
for the anticipated size of the force six or twelve months ahead. The
method developed in [4] was specifically oriented to the problem of
forecasting the size of the Marine Corps and was applied in [5] and [6] to
estimate the total size of the Marine Corps. As explained in [4], we
distinguish between four phases in the service of an enlistee. Phase I
consists of the first six months of basic training; Phase II follows Phase
I and continues until six months before the termination of the (first)
contract. Phase III consists of the last six months of the (first)
contract. Phase IV is the phase of career service. We have shown in
[4] that the first three phases are similar to those of reliability systems,
in which the failure rate function passes from an initial phase of high
failure rate to a phase of stable failure rate (Phase II) and then reaches a
phase of wearout (Phase III). In the present paper we develop the theory
for Phase II forecasting and compare three alternative procedures. The
first procedure provides conditional maximum likelihood estimates (CMLE) of
the limits of prediction intervals. The second procedure determines
tolerance intervals and the third procedure provides Bayes prediction
intervals for the size of the force. We show that although the CMLE
procedure is less conservative than the other two procedures, it has
yielded prediction intervals for the total force which contain the actual
values (on a retroactive basis). This is due to the fact that the samples
are large and the standard errors of the estimates are sufficiently small.

Project NR 347-020 at the Program in Logistics, The George Washington

The Phase II forecasting is performed for cohorts consisting of
enlistees classified according to the following factors: time of entry,
length of contract, race and education. Haber has shown in [2] that
among the many factors that can possibly affect the retention rates the
above factors are the most significant. We base the model of the present
study on the same factors. The structure of the data as related to these
factors is explained in Section 2. In Section 3 we discuss the basic
statistical model for adaptive forecasting. In Section 4 we develop the
concept of prediction intervals as a tool of forecasting. In Section 5
we discuss CMLE prediction limits, tolerance limits and Bayes prediction
limits for the size of cohorts which have been in Phase II more than six
months (at least 12 months in service). All the forecasting procedures
are compared numerically by applying them retroactively to actual Marine
Corps data for the period January 1969 to July 1972. Section 6 is
devoted to the forecasting problem for cohorts just entering Phase II.
The methods developed there are related to those of Section 5. The FORTRAN
programs with which the examples have been computed are available and can
be obtained upon request. Finally, we mention that although there is a vast
literature on subjects which are touched on or mentioned in the present
study, either from the point of view of manpower statistics or from
the areas of reliability, Poisson processes, statistical distributions
and Bayesian analysis, we have cited in the paper only those
references which are most relevant to the discussion.

2.  The  Data  and  Structural  Components 

The random variables considered are the numbers of enlistees remaining
in service at six-month intervals, in groups (cohorts) classified
according to the following characteristics:

A.  Length of enlistment contract (LC)

        Two years,                        i = 1
        Three years,                      i = 2
        Four years,                       i = 3

B.  Education Level (ED)

        Less than high school,            j = 1 (LHS)
        High school or more,              j = 2 (AHS)

C.  Race (R)

        White,                            k = 1 (W)
        Non-White,                        k = 2 (NW)

D.  Time of Entry into Service (in six-month periods) (EN)

        January 1-June 30, 1969           l = 1 (1-69)
        July 1-December 31, 1969          l = 2 (7-69)
        Up to January 1-June 30, 1972     l = 7 (1-72)

In the following table we present the observed values of these random
variables at the time intervals starting on July 1, 1971, January 1, 1972
and July 1, 1972. These values are later used to illustrate the forecasting
procedures.

As in the previous study [4] we distinguish between three phases in
the service of first enlistees.

Phase I:   the first six months of basic training.

Phase II:  the period following Phase I up to the last six months
           before termination of contract.

Phase III: the last six months of the contract.

In Figure 1 we illustrate the retention of cohorts (in percent) in relation
to their three phases of service.


TABLE 1

NUMBER OF ENLISTEES REMAINING IN SERVICE IN PERIODS
STARTING ON JULY 1, 1971, JANUARY 1, 1972 AND JULY 1, 1972

TABLE 1 (Cont'd)

  LC    ED    R     EN      7-1-71   1-1-72   7-1-72

   4    LHS   W     1-70     2935     2594     2347
   4    LHS   NW    1-70      384      327      292
   4    AHS   W     1-70     3715     3556     3416
   4    AHS   NW    1-70      489      454      428
   4    LHS   W     7-70     3173     2864     2633
   4    LHS   NW    7-70      356      321      294
   4    AHS   W     7-70     4401     4267     4108
   4    AHS   NW    7-70      547      511      478
   4    LHS   W     1-71     3325     3146     2979
   4    LHS   NW    1-71      423      410      386
   4    AHS   W     1-71     3366     3294     3207
   4    AHS   NW    1-71      559      546      526
   4    LHS   W     7-71     3969     3358     3174
   4    LHS   NW    7-71      479      402      383
   4    AHS   W     7-71     4600     4445     4281
   4    AHS   NW    7-71      688      632      607


3.  The Statistical Model

Let $X_{ijk\ell}(t)$ denote the number of enlistees remaining in service
at the end of the t-th epoch, where an epoch is a six-month period, from
the group (cohort) defined by the $(i,j,k,\ell)$ factor combination. The
statistical model specifies that the conditional distribution of
$X_{ijk\ell}(t+1)$ given $X_{ijk\ell}(t)$ is the binomial distribution with
parameters $X_{ijk\ell}(t)$ (number of Bernoulli trials) and
$\theta_{ijk\ell}(t+1)$ (retention probability), i.e., for every $t \ge 1$
and all $(i,j,k,\ell)$,

$$X_{ijk\ell}(t+1) \mid X_{ijk\ell}(t) \;\sim\; B\bigl(X_{ijk\ell}(t),\, \theta_{ijk\ell}(t+1)\bigr). \tag{3.1}$$

Generally $B(N,p)$ denotes a binomial distribution corresponding to N
Bernoulli trials and probability of success p, 0 < p < 1. The symbol
$\sim$ in (3.1) designates that, conditionally on $X_{ijk\ell}(t)$, the random
variable $X_{ijk\ell}(t+1)$ has the specified binomial distribution. Indeed,
the number of enlistees from the $(i,j,k,\ell)$ cohort remaining in service
at the (t+1)st epoch can assume any value from 0 to $X_{ijk\ell}(t)$. It
is assumed that each individual in a given cohort has the same retention
probability in a given future period and that the events of retention or
withdrawal of different individuals are independent. This model may
be justifiably criticized if applied to small groups, due to possible
interactive forces among the members. It seems, however, to be an
adequate model for describing the retention in large cohorts.


4.  Prediction Intervals

After observing the sequence of X-values for each $(i,j,k,\ell)$
combination, at epochs 1,...,t, we wish to forecast their possible values s
epochs in the future. One can easily verify that the conditional
distribution of $X_{ijk\ell}(t+s)$, given $X_{ijk\ell}(t)$, is the binomial
$B\bigl(X_{ijk\ell}(t),\, P_{ijk\ell}(t,s)\bigr)$ where

$$P_{ijk\ell}(t,s) = \prod_{v=1}^{s} \theta_{ijk\ell}(t+v), \qquad t = 0,1,\ldots;\; s = 1,2,\ldots. \tag{4.1}$$

Accordingly, if the sequence of retention probabilities
$\theta_{ijk\ell} = \{\theta_{ijk\ell}(t);\ t \ge 1\}$ is known one can determine
for each t and s prediction intervals
$(\underline{\eta}_{ijk\ell}(t+s),\ \bar{\eta}_{ijk\ell}(t+s))$, satisfying the
requirement that, for each $t \ge 0$,

$$P\bigl[\underline{\eta}_{ijk\ell}(t+s) \le X_{ijk\ell}(t+s) \le \bar{\eta}_{ijk\ell}(t+s) \,\bigm|\, X_{ijk\ell}(t),\, \theta_{ijk\ell}\bigr] \ge 1 - \alpha, \tag{4.2}$$

where $(1 - \alpha)$ is a preassigned probability level.

The limits of these prediction intervals depend on $X_{ijk\ell}(t)$ and
on $\theta_{ijk\ell}$ and are given by

$$\underline{\eta}_{ijk\ell}(t+s) = \text{maximal non-negative integer } k \text{ such that } B\bigl(k \mid X_{ijk\ell}(t),\, P_{ijk\ell}(t,s)\bigr) \le \alpha/2 \tag{4.3}$$

and

$$\bar{\eta}_{ijk\ell}(t+s) = \text{least non-negative integer } k \text{ such that } B\bigl(k \mid X_{ijk\ell}(t),\, P_{ijk\ell}(t,s)\bigr) \ge 1 - \alpha/2, \tag{4.4}$$

where $B(k \mid X_{ijk\ell}(t), P_{ijk\ell}(t,s))$ is the c.d.f. of the corresponding
binomial distribution. The determination of the prediction limits (4.3)
and (4.4) requires the computation of the corresponding binomial c.d.f.'s.
In the cases considered here, the X-values are generally of the order of
several hundreds or thousands (see Table 1). In these cases the numerical
determination of the binomial c.d.f. applies the normal approximation
(see Johnson and Kotz [3; pp. 62]). Based on this approximation we
employ the following large sample formulae for the prediction limits:

$$\bar{\eta}_{ijk\ell}(t+s) \approx X_{ijk\ell}(t)\,P_{ijk\ell}(t,s) + z_{1-\alpha/2}\,\bigl\{X_{ijk\ell}(t)\,P_{ijk\ell}(t,s)\,Q_{ijk\ell}(t,s)\bigr\}^{1/2}$$

and

$$\underline{\eta}_{ijk\ell}(t+s) \approx X_{ijk\ell}(t)\,P_{ijk\ell}(t,s) - z_{1-\alpha/2}\,\bigl\{X_{ijk\ell}(t)\,P_{ijk\ell}(t,s)\,Q_{ijk\ell}(t,s)\bigr\}^{1/2}, \tag{4.5}$$

where $Q_{ijk\ell}(t,s) = 1 - P_{ijk\ell}(t,s)$ and $z_{1-\alpha/2}$ is the
$(1 - \alpha/2)$th fractile of the standard normal distribution. Generally the
values of the retention probabilities $P_{ijk\ell}(t,s)$ are unknown and should
be estimated from the data. In the following sections we discuss the problem
of determining the prediction limits when the retention probabilities are
unknown.
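
As an illustration of the large sample formulae (4.5), the following is a minimal Python sketch, assuming the retention probabilities for the next s epochs are known. The function name `prediction_limits` and its arguments are hypothetical and are not part of the original FORTRAN programs; the default z = 1.96 corresponds to alpha = .05.

```python
import math


def prediction_limits(x_t, thetas, z=1.96):
    """Normal-approximation prediction limits (4.5) for X(t+s) given X(t) = x_t.

    thetas holds the assumed-known retention probabilities
    theta(t+1), ..., theta(t+s); their product is P(t, s) as in (4.1).
    """
    p = math.prod(thetas)              # P(t, s)
    q = 1.0 - p                        # Q(t, s)
    center = x_t * p                   # conditional mean of X(t+s)
    half = z * math.sqrt(x_t * p * q)  # z times the conditional standard deviation
    return center - half, center + half


# Hypothetical cohort of 1000 with retention 0.9 in each of the next two epochs.
lower, upper = prediction_limits(1000, [0.9, 0.9])
print(lower, upper)
```

The exact limits (4.3) and (4.4) would instead require the binomial c.d.f.; the sketch covers only the large sample version.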
5.  Estimating the Prediction Intervals

In the present section we discuss three methods of estimating the
prediction intervals when the retention probabilities are unknown. To
simplify the presentation we restrict attention to six-month
forecasting, s = 1. We discuss here only cohorts in Phase II
which have been in service at least 12 months, i.e., $t \ge 2$. The case
t = 1 will be discussed later.

5.1  CMLE Prediction Limits

For estimating the retention probabilities $\theta_{ijk\ell}(t+1)$, $t \ge 2$, we
develop methods which depend only on the observed statistics of the
cohort under consideration. These methods can be modified to use
also the statistics of previous cohorts (as will be indicated in Section 6).
For that modification we have to introduce the assumption that all the
cohorts having the same length of contract and the same education-race
combination behave similarly, with respect to their attrition, irrespective
of the period of entry to service. The methods discussed in the present
section are free of such an assumption. We have to assume, however, that
the retention probabilities of each cohort have close values in adjacent
periods. Thus, we can estimate $\theta_{ijk\ell}(t)$ by the conditional maximum
likelihood estimate (CMLE)

$$\hat{\theta}_{ijk\ell}(t) = X_{ijk\ell}(t) / X_{ijk\ell}(t-1), \qquad t \ge 2, \tag{5.1}$$

and use it in place of $\theta_{ijk\ell}(t+1)$. This estimator employs, for each
cohort, only the current and the previous six-month values of X. Point
estimates of the prediction limits for $X_{ijk\ell}(t+1)$ are obtained by
substituting in formula (4.5) the value of $\hat{\theta}_{ijk\ell}(t)$ for
$P_{ijk\ell}(t,s)$. These estimates of the prediction limits are
labelled the CMLE prediction limits. They are given by the formula

$$X_{ijk\ell}(t)\,\hat{\theta}_{ijk\ell}(t) \;\pm\; z_{1-\alpha/2}\,\bigl\{X_{ijk\ell}(t)\,\hat{\theta}_{ijk\ell}(t)\,\bigl(1 - \hat{\theta}_{ijk\ell}(t)\bigr)\bigr\}^{1/2}. \tag{5.2}$$

Indeed, according to the invariance principle of maximum likelihood
estimation (see Zacks [7; pp. 223]) the CMLE limits given by (5.2) are
themselves conditional maximum likelihood estimators of the prediction
limits. Furthermore, according to the large sample theory of maximum
likelihood estimators (see Zacks [7; pp. 247]) the sampling distributions
of these CMLE limits are approximately normal around the true prediction
limits. Accordingly we may expect that in about 25-50% of the cases the
CMLE prediction intervals (5.2) will be too short and will not cover the
true prediction intervals. This is our main reason for developing
alternative methods of estimation. On the other hand, due to the large
samples under consideration and the strong consistency of the maximum
likelihood estimators (see Zacks [7]) we expect that the CMLE prediction
limits will be close to the true limits, even if the intervals are not
entirely covered. Numerical examples will be given in the sequel to
illustrate this point.
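
The CMLE limits can be sketched in a few lines of Python, assuming the estimator (5.1) is plugged into the large sample formula with s = 1. The function name `cmle_limits` is hypothetical; the inputs are the four-year, LHS, White cohort entering in 1-70 from Table 1 (2935 on July 1, 1971 and 2594 on January 1, 1972).

```python
import math


def cmle_limits(x_prev, x_curr, z=1.96):
    """CMLE prediction limits for X(t+1) from X(t-1) = x_prev and X(t) = x_curr."""
    theta_hat = x_curr / x_prev        # estimator (5.1)
    center = x_curr * theta_hat        # predicted size six months ahead
    half = z * math.sqrt(x_curr * theta_hat * (1.0 - theta_hat))
    return center - half, center + half


lower, upper = cmle_limits(2935, 2594)
print(round(lower), round(upper))
```

For this cohort the sketch gives limits of roughly 2261 and 2325, while the observed size on July 1, 1972 in Table 1 is 2347, which is a case where the CMLE interval is too short.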


5.2  Tolerance Limits for the Prediction Intervals

One type of tolerance limits can be defined as confidence limits for
fractiles of a distribution. We derive here tolerance limits for the
prediction intervals (5.2) within a large sample framework. It is well
known that in large binomial samples, if $\hat{\theta}$ is the CMLE of $\theta$ then
$2\sin^{-1}(\sqrt{\hat{\theta}})$ has approximately a normal distribution with
expectation $2\sin^{-1}(\sqrt{\theta})$ and variance 1/n, where n is the sample
size (see Johnson and Kotz [3; pp. 65]). Since the conditional distribution
of $X_{ijk\ell}(t+1)$ given $X_{ijk\ell}(t)$ is binomial with retention probability
$\theta_{ijk\ell}(t+1)$ we consider the statistic

$$Y_{ijk\ell}(t) = 2\sin^{-1}\left(\sqrt{\frac{X_{ijk\ell}(t) + .5}{X_{ijk\ell}(t-1) + 1}}\right), \qquad t \ge 2. \tag{5.3}$$

The mean of the asymptotic conditional normal distribution of $Y_{ijk\ell}(t)$,
given $X_{ijk\ell}(t-1)$, is

$$H_{ijk\ell}(t) = 2\sin^{-1}\left(\sqrt{\theta_{ijk\ell}(t)}\right), \tag{5.4}$$

and its asymptotic conditional variance is $1/X_{ijk\ell}(t-1)$.

Let $Y_{ijk\ell}(t+1)$ represent a statistic as in (5.3) based on
$X_{ijk\ell}(t)$ and the yet unobserved $X_{ijk\ell}(t+1)$. The asymptotic
conditional distribution of $Y_{ijk\ell}(t+1)$ given $X_{ijk\ell}(t)$ is, according
to our assumption that $\theta_{ijk\ell}(t+1) = \theta_{ijk\ell}(t)$, normal with mean
$H_{ijk\ell}(t)$ and variance $1/X_{ijk\ell}(t)$.

We determine now confidence limits for the unknown value of $H_{ijk\ell}(t)$.
A $(1 - \alpha/2)$ lower confidence limit for $H_{ijk\ell}(t)$ is, according to the
normal approximation,

$$L_{ijk\ell}(t) = Y_{ijk\ell}(t) - z_{1-\alpha/2} \big/ \sqrt{X_{ijk\ell}(t-1)}. \tag{5.5}$$

Similarly, a $(1 - \alpha/2)$ upper confidence limit for $H_{ijk\ell}(t)$ is
approximately

$$U_{ijk\ell}(t) = Y_{ijk\ell}(t) + z_{1-\alpha/2} \big/ \sqrt{X_{ijk\ell}(t-1)}. \tag{5.6}$$

We apply now the Bonferroni inequality

$$P[A \cap B] \ge 1 - P(\bar{A}) - P(\bar{B}), \tag{5.7}$$

where A is the event that the confidence limits (5.5) and (5.6) cover
$H_{ijk\ell}(t)$ and B is the event that $Y_{ijk\ell}(t+1)$ belongs to the interval

$$H_{ijk\ell}(t) \pm z_{1-\alpha/2} \big/ \sqrt{X_{ijk\ell}(t)}. \tag{5.8}$$

Accordingly, the probability is at least $1 - 2\alpha$ that the limits

$$Y_{ijk\ell}(t) \pm z_{1-\alpha/2}\left(\frac{1}{\sqrt{X_{ijk\ell}(t-1)}} + \frac{1}{\sqrt{X_{ijk\ell}(t)}}\right) \tag{5.9}$$

cover the future value $Y_{ijk\ell}(t+1)$. These limits will be called tolerance
limits. Finally, since the function $2\sin^{-1}(\sqrt{\theta})$, for
$0 < \theta < 1$, is strictly increasing we employ the inverse of this
transformation on the two tolerance limits of $Y_{ijk\ell}(t+1)$ to obtain
corresponding tolerance limits for $X_{ijk\ell}(t+1)$. These limits are:

$$\underline{\eta}^{(T)}_{ijk\ell}(t+1) = \bigl(\sin(Y^{(L)}_{ijk\ell}(t)/2)\bigr)^2\,\bigl(X_{ijk\ell}(t) + 1\bigr) - .5 \tag{5.10}$$

and

$$\bar{\eta}^{(T)}_{ijk\ell}(t+1) = \bigl(\sin(Y^{(U)}_{ijk\ell}(t)/2)\bigr)^2\,\bigl(X_{ijk\ell}(t) + 1\bigr) - .5, \tag{5.11}$$

where $Y^{(U)}_{ijk\ell}(t)$ and $Y^{(L)}_{ijk\ell}(t)$ are the upper and lower
tolerance limits of $Y_{ijk\ell}(t+1)$.
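
The chain from the arcsine statistic (5.3) through the limits (5.9) and back through (5.10) and (5.11) can be sketched in Python as follows. The function names are hypothetical, and the clamping of the transformed limits to the interval [0, pi] is a safeguard added here as an assumption; it is not part of the original derivation and does not affect the example. The inputs are the four-year, LHS, Non-White cohort entering in 7-70 from Table 1 (356 and 321).

```python
import math


def arcsine_stat(x_prev, x_curr):
    """Statistic (5.3): 2 * arcsin of the square root of the corrected retention ratio."""
    return 2.0 * math.asin(math.sqrt((x_curr + 0.5) / (x_prev + 1.0)))


def back_transform(y, x_curr):
    """Inverse transformation as in (5.10)-(5.11); y is clamped to [0, pi] (added safeguard)."""
    y = min(max(y, 0.0), math.pi)
    return math.sin(y / 2.0) ** 2 * (x_curr + 1.0) - 0.5


def tolerance_limits(x_prev, x_curr, z=1.96):
    """Tolerance limits for X(t+1) given X(t-1) = x_prev and X(t) = x_curr."""
    y = arcsine_stat(x_prev, x_curr)
    half = z * (1.0 / math.sqrt(x_prev) + 1.0 / math.sqrt(x_curr))  # half-width in (5.9)
    return back_transform(y - half, x_curr), back_transform(y + half, x_curr)


lower, upper = tolerance_limits(356, 321)
print(round(lower), round(upper))
```

For this cohort the sketch gives limits of roughly 266 and 307; the observed size on July 1, 1972 in Table 1 is 294.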


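5.3  Bayes Prediction Limits

The forecasting of $X_{ijk\ell}(t+1)$, given the prior data, can be
attempted also from a Bayesian point of view. The limits of the prediction
intervals, as defined in Section 4, depend on the unknown retention
probabilities $\theta_{ijk\ell}(t)$. In a Bayesian framework we ascribe these
parameters prior distributions. Given the past observations we can derive
the posterior distributions of $\theta_{ijk\ell}(t)$. Let $G_{ijk\ell}(\theta)$
designate the prior distribution function of $\theta_{ijk\ell}(t)$, t = 2,3,....
We assume that for each $(i,j,k,\ell)$ the retention probabilities
$\theta_{ijk\ell}(t)$ have the same prior distributions. Let
$G_{ijk\ell}(\theta \mid X_{ijk\ell}(t-1), X_{ijk\ell}(t))$ designate the posterior
distribution of $\theta_{ijk\ell}(t)$ given the observed values of
$X_{ijk\ell}(t-1)$ and $X_{ijk\ell}(t)$. According to this
posterior distribution we can compute the posterior marginal distribution
of $X_{ijk\ell}(t+1)$ given $X_{ijk\ell}(t-1)$ and $X_{ijk\ell}(t)$. This
distribution is called also the predictive distribution and is given by

$$P\bigl[X_{ijk\ell}(t+1) = x \bigm| X_{ijk\ell}(t), X_{ijk\ell}(t-1)\bigr] = \int_0^1 b\bigl(x \mid X_{ijk\ell}(t), \theta\bigr)\, dG_{ijk\ell}\bigl(\theta \mid X_{ijk\ell}(t-1), X_{ijk\ell}(t)\bigr), \tag{5.12}$$

where $b(x \mid N, \theta)$ is the binomial probability function. Bayes prediction
limits for $X_{ijk\ell}(t+1)$, at level $1 - \alpha$, are values
$\underline{\eta}^{(B)}_{ijk\ell}(t+1)$ and $\bar{\eta}^{(B)}_{ijk\ell}(t+1)$ such that

$$P\bigl[\underline{\eta}^{(B)}_{ijk\ell}(t+1) \le X_{ijk\ell}(t+1) \le \bar{\eta}^{(B)}_{ijk\ell}(t+1) \bigm| X_{ijk\ell}(t), X_{ijk\ell}(t-1)\bigr] \ge 1 - \alpha. \tag{5.13}$$

Aitchison and Dunsmore [1] consider, for binomial distributions of future
observations, beta-binomial predictive distributions, given the number of
successes observed in previous trials. These are predictive distributions
obtained by ascribing $\theta$ prior beta distributions. In our specific
problem it would be impractical to consider such beta-binomial predictive
distributions from the numerical point of view. Instead we apply the
arcsin transformation (5.3). As before, the conditional (large sample)
distribution of $Y_{ijk\ell}(t)$ is normal with expectation
$H_{ijk\ell}(t) = 2\sin^{-1}(\sqrt{\theta_{ijk\ell}(t)})$ and variance
$1/X_{ijk\ell}(t-1)$. We assume that the prior distribution of $H_{ijk\ell}(t)$ is
normal with expectation $\omega$ and variance $\tau^2$, i.e.,
$H_{ijk\ell}(t) \sim N(\omega, \tau^2)$. It is immediate then to
verify that the posterior distribution of $H_{ijk\ell}(t)$, given that
$Y_{ijk\ell}(t) = y$ and $X_{ijk\ell}(t-1) = x$, is

$$H_{ijk\ell}(t) \mid y, x \;\sim\; N\left(\frac{y\tau^2 x + \omega}{1 + \tau^2 x},\; \frac{\tau^2}{1 + \tau^2 x}\right). \tag{5.14}$$

The symbol $\sim$ designates "distributed like". (5.14) means that the
posterior distribution of $H_{ijk\ell}(t)$ given $Y_{ijk\ell}(t) = y$ and
$X_{ijk\ell}(t-1) = x$ is normal, with mean $(y\tau^2 x + \omega)/(1 + \tau^2 x)$ and
variance $\tau^2/(1 + \tau^2 x)$. We will consider the limiting posterior
distribution obtained by letting $\tau^2 \to \infty$ (diffused prior). Under this
condition

$$H_{ijk\ell}(t) \mid Y_{ijk\ell}(t), X_{ijk\ell}(t-1) \;\sim\; N\left(Y_{ijk\ell}(t),\; \frac{1}{X_{ijk\ell}(t-1)}\right). \tag{5.15}$$

From (5.15) we obtain that the predictive distribution of
$Y_{ijk\ell}(t+1)$ given $X_{ijk\ell}(t)$ and $X_{ijk\ell}(t-1)$ is

$$Y_{ijk\ell}(t+1) \mid X_{ijk\ell}(t), X_{ijk\ell}(t-1) \;\sim\; N\left(Y_{ijk\ell}(t),\; \frac{1}{X_{ijk\ell}(t)} + \frac{1}{X_{ijk\ell}(t-1)}\right). \tag{5.16}$$

Hence, the Bayes prediction limits for $Y_{ijk\ell}(t+1)$ given $X_{ijk\ell}(t)$ and
$X_{ijk\ell}(t-1)$ are

$$Y_{ijk\ell}(t) \pm z_{1-\alpha/2}\left(\frac{1}{X_{ijk\ell}(t)} + \frac{1}{X_{ijk\ell}(t-1)}\right)^{1/2}. \tag{5.17}$$

Notice that the interval given by (5.17) is always contained in the
tolerance interval (5.9). If $w^{(U)}_{ijk\ell}(t)$ and $w^{(L)}_{ijk\ell}(t)$ denote the
upper and lower Bayes prediction limits in (5.17) then the Bayes prediction
limits for $X_{ijk\ell}(t+1)$ are respectively

$$\bar{\eta}^{(B)}_{ijk\ell}(t+1) = \bigl(\sin(w^{(U)}_{ijk\ell}(t)/2)\bigr)^2\,\bigl(X_{ijk\ell}(t) + 1\bigr) - .5 \tag{5.18}$$

$$\underline{\eta}^{(B)}_{ijk\ell}(t+1) = \bigl(\sin(w^{(L)}_{ijk\ell}(t)/2)\bigr)^2\,\bigl(X_{ijk\ell}(t) + 1\bigr) - .5 \tag{5.19}$$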
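
A minimal Python sketch of the diffuse-prior Bayes limits follows, assuming the predictive variance 1/X(t) + 1/X(t-1) on the arcsine scale and the same back-transformation as for the tolerance limits. The function name `bayes_limits` is hypothetical. The inputs are again the four-year, LHS, White cohort entering in 1-70 from Table 1.

```python
import math


def bayes_limits(x_prev, x_curr, z=1.96):
    """Diffuse-prior Bayes prediction limits for X(t+1) given X(t-1) and X(t)."""
    # Arcsine statistic Y(t) of (5.3).
    y = 2.0 * math.asin(math.sqrt((x_curr + 0.5) / (x_prev + 1.0)))
    # Half-width on the arcsine scale: z times the predictive standard deviation.
    half = z * math.sqrt(1.0 / x_curr + 1.0 / x_prev)

    def back(v):
        # Inverse arcsine transformation with the continuity correction.
        return math.sin(v / 2.0) ** 2 * (x_curr + 1.0) - 0.5

    return back(y - half), back(y + half)


lower, upper = bayes_limits(2935, 2594)
print(round(lower), round(upper))
```

For this cohort the sketch gives roughly 2247 and 2335. Since sqrt(a + b) <= sqrt(a) + sqrt(b) for non-negative a and b, the half-width used here never exceeds the one in (5.9), which is why the Bayes interval is contained in the tolerance interval.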

In the following table we provide numerical comparisons of the CMLE,
tolerance and Bayes prediction limits for the size of cohorts in Phase II
($t \ge 2$) on July 1, 1972. We provide also the actual size of the cohorts.


TABLE 2

PREDICTION LIMITS FOR THE SIZE OF COHORTS ON JULY 1, 1972 (alpha = .05).

                      CMLE           Tolerance          Bayes
  i  j  k  m      Lower   Upper    Lower   Upper    Lower   Upper    Actual

  1  1  1  1      1401.   1434.    1382.   1446.    1393.   1439.    141*.
  1  1  2  1       507.    520.     496.    523.     502.    521.     499.
  1  2  1  1      3272.   3303.    3253.   3315.    3264.   3307.    3280.
  1  2  2  1       676.    693.     665.    697.     671.    694.     674.
  2  1  1  1       935.    979.     912.    997.     926.    986.    1009.
  2  1  1  2       876.    916.     855.    932.     867.    922.     910.
  2  1  1  3      1133.   1165.    1115.   1177.    1125.   1170.    1155.
  2  1  2  1       113.    130.     103.    136.     109.    132.     135.
  2  1  2  2        98.    111.      89.    114.      94.    112.     104.
  2  1  2  3       158.    169.     149.    171.     154.    170.     163.
  2  2  1  1      1973.   2009.    1952.   2023.    1964.   2014.    2019.
  2  2  1  2      2277.   2309.    2257.   2322.    2269.   2314.    2301.
  2  2  1  3      1112.   1133.    1099.   1140.    1106.   1136.    1121.
  2  2  2  1       246.    260.     235.    264.     241.    261.     256.
  2  2  2  2       223.    236.     214.    239.     219.    237.     230.
  2  2  2  3       171.    181.     163.    183.     167.    182.     177.
  3  1  1  1      1467.   1518.    1440.   1540.    1456.   1527.    1570.
  3  1  1  2      1692.   1749.    1663.   1773.    1680.   1758.    1781.
  3  1  1  3      2261.   2325.    2228.   2352.    2247.   2335.    2347.
  3  1  1  4      2554.   2616.    2522.   2643.    2541.   2627.    2633.
  3  1  1  5      2952.   3001.    2924.   3022.    2940.   3010.    2979.
  3  1  2  1       147.    166.     136.    173.     142.    169.     165.
  3  1  2  2       167.    187.     156.    194.     162.    190.     190.
  3  1  2  3       266.    291.     252.    300.     260.    294.     292.
  3  1  2  4       279.    300.     266.    307.     274.    302.     294.
  3  1  2  5       391.    404.     380.    408.     386.    405.     386.
  3  2  1  1      3333.   3379.    3307.   3399.    3322.   3387.    3403.
  3  2  1  2      4623.   4676.    4594.   4699.    4611.   4685.    4675.
  3  2  1  3      3380.   3427.    3354.   3447.    3369.   3435.    3416.
  3  2  1  4      4115.   4159.    4090.   4177.    4104.   4166.    4108.
  3  2  1  5      3207.   3240.    3187.   3252.    3199.   3245.    3207.
  3  2  2  1       331.    348.     319.    353.     325.    350.     345.
  3  2  2  2       431.    452.     417.    459.     425.    455.     451.
  3  2  2  3       411.    432.     397.    440.     405.    435.     428.
  3  2  2  4       466.    488.     453.    496.     460.    491.     478.
  3  2  2  5       526.    540.     516.    544.     522.    541.     526.

  Sums            48170   49249    47537   49656    47902   49404    49130
The m values corresponding to the various i values of Table 2 index
the entry periods, in order of presentation in Table 1, excluding the
July 71 entry period. Accordingly, for i = 1, m = 1 corresponds to the 1-71
entry period; for i = 2, m = 1 corresponds to the 1-70 entry period, etc.

As seen in Table 2 the three systems of prediction limits
provide (on a retrospective basis) good forecasting. The CMLE intervals
are the shortest and the tolerance intervals are the widest. As
expected, however, the CMLE intervals may be too short and their actual
coverage probability too small. Indeed, in Table 2 the actual
cohort sizes fall outside the CMLE intervals in 15 out of 36 cases,
while in only four cases are they not covered by the tolerance intervals.
When we sum the lower and the upper limits over all cohorts we obtain
more conservative prediction intervals for the total force in Phase II.
We see in Table 2 that all three types of prediction intervals
(CMLE, tolerance and Bayes) yield total force intervals which cover the
actual value.

6.  Estimating the Retention Rate After Six Months of Service

Cohorts which at the time of forecasting have been only six months
in service should be considered separately, since generally the
assumption $\theta_{ijk\ell}(t+1) = \theta_{ijk\ell}(t)$ is invalid for t = 1. Indeed,
these cohorts just start their Phase II of service. We can, however,
estimate in these cases the retention probability $\theta_{ijk\ell}(2)$ on the basis
of the retention rates of previous cohorts during the period t = 2
(from six till twelve months). We propose the usage of the ratio estimator

$$\hat{\theta}_{ijk\ell}(2) = \frac{\sum_{v=1}^{\ell-1} X_{ijkv}(2)}{\sum_{v=1}^{\ell-1} X_{ijkv}(1)} \tag{6.1}$$

for every $\ell \ge 2$. Due to the independence of the random variables
$X_{ijk\ell}(1)$ and $X_{ijk\ell}(2)$ as $\ell$ varies, the conditional distribution of
$\sum_{v=1}^{\ell-1} X_{ijkv}(2)$, given $\{X_{ijk1}(1), \ldots, X_{ijk,\ell-1}(1)\}$, is
the binomial $B\bigl(\sum_{v=1}^{\ell-1} X_{ijkv}(1),\ \theta_{ijk\ell}(2)\bigr)$. We assume
here that $\theta_{ijk\ell}(2)$ is the same for all entry periods
($\ell$ = 1,2,...). Accordingly we can obtain CMLE, tolerance and Bayes
prediction limits for $X_{ijk\ell}(2)$ by employing the methods of Section 5
with the ratio estimator (6.1) rather than the estimator (5.1). The CMLE
prediction limits are obtained from (5.2) by substituting (6.1) for
$\hat{\theta}_{ijk\ell}(t)$. To obtain the tolerance limits for $X_{ijk\ell}(2)$ we define

$$Y^{*}_{ijk\ell}(2) = 2\sin^{-1}\left(\sqrt{\hat{\theta}_{ijk\ell}(2)}\right). \tag{6.2}$$

$(1-\alpha)$-confidence intervals for $H_{ijk\ell}(2)$ are then given, for large
samples, by

$$Y^{*}_{ijk\ell}(2) \pm z_{1-\alpha/2}\left(\sum_{v=1}^{\ell-1} X_{ijkv}(1)\right)^{-1/2}. \tag{6.3}$$

Tolerance limits for $Y_{ijk\ell}(2)$, which is the arcsin transform of
$(X_{ijk\ell}(2) + \tfrac{1}{2})/(X_{ijk\ell}(1) + 1)$, are then

$$Y^{*}_{ijk\ell}(2) \pm z_{1-\alpha/2}\left[\left(\sum_{v=1}^{\ell-1} X_{ijkv}(1)\right)^{-1/2} + \bigl(X_{ijk\ell}(1)\bigr)^{-1/2}\right]. \tag{6.4}$$

Substituting in (5.10)-(5.11) the two limits obtained from (6.4) we
obtain tolerance limits for $X_{ijk\ell}(2)$. In a similar manner we
obtain that the Bayes prediction limits are

$$Y^{*}_{ijk\ell}(2) \pm z_{1-\alpha/2}\left[\frac{1}{\sum_{v=1}^{\ell-1} X_{ijkv}(1)} + \frac{1}{X_{ijk\ell}(1)}\right]^{1/2}. \tag{6.5}$$

Tolerance and Bayes prediction limits for $X_{ijk\ell}(2)$ are obtained by
substituting the upper and lower limits of (6.4) and (6.5), respectively,
in (5.10) and (5.11), or in (5.18) and (5.19). We now try this approach
on the real Marine Corps data of first enlistees. In Table 3 we provide
the numbers of first enlistees, according to their cohorts, which remained
in service at epochs t = 1 and t = 2. We then apply the above
formulae to forecast the numbers at t = 2 on the basis of t = 1.


TABLE 3

NUMBER OF ENLISTEES REMAINING IN SERVICE
AFTER 6 AND 12 MONTHS

         Entry    Contract     LHS - W         LHS - N         AHS - W         AHS - N
         Period l Length      6m     12m      6m     12m      6m     12m      6m     12m

(1969)     1      2 Yrs.     6,057   5,631   1,647   1,544  12,116  11,648   2,127   2,017
           2                 3,421   3,146   1,175   1,080  10,756  10,385   1,722   1,637
           1      3 Yrs.     1,767   1,627     204     194   2,374   2,312     211     196
           2                 2,376   2,166     298     270   3,603   3,501     456     433
           1      4 Yrs.     3,145   2,853     441     408   4,217   4,081     471     456
           2                 3,197   2,864     371     334   5,547   5,371     586     554

(1970)     3      2 Yrs.     1,440   1,373     496     474   3,915   3,823     869     843
           4                 1,562   1,493     421     400   4,269   4,169     790     769
           3      3 Yrs.     1,466   1,372     195     187   2,496   2,228     302     292
           4                 1,208   1,143     134     131   2,496   2,437     257     251
           3      4 Yrs.     3,370   3,165     432     401   3,934   3,835     532     512
           4                 3,333   3,173     371     356   4,527   4,401     563     547

(1971)     5      2 Yrs.     1,564   1,489     537     525   3,416   3,351     720     702
           6                 1,809   1,717     744     716   4,553   4,451   1,235   1,195
           5      3 Yrs.     1,293   1,219     181     172   1,180   1,151     190     183
           6                   919     866     143     133   1,034     997     233     227
           5      4 Yrs.     3,325   3,146     423     410   3,366   3,294     559     546
           6                 3,359   3,174     402     383   4,445   4,281     632     607

(1972)     7      2 Yrs.     2,710   2,585   1,223   1,177   3,517   3,413   1,311   1,273
           8                 1,659   1,576     729     688   3,149   3,075   1,052   1,014
           7      3 Yrs.     2,067   1,964     434     417   1,329   1,299     324     307
           8                 1,766   1,669     465     441   1,568   1,528     500     477
           7      4 Yrs.     4,945   4,696     940     893   3,731   3,620     684     655
           8                 4,680   4,407   1,577   1,514   4,461   4,305   1,393   1,328
TABLE 4

PREDICTION LIMITS FOR THE SIZE OF COHORTS AT THE BEGINNING OF
PHASE II (alpha = .05).

                      CMLE           Tolerance          Bayes
  i  j  k  l      Lower   Upper    Lower   Upper    Lower   Upper    Actual

  1  1  1  2      3151.   3210.    3127.   3229.    3142.   3216.    3146.
  1  1  1  3      1314.   1353.    1305.   1359.    1312.   1353.    1373.
  1  1  1  4      1432.   1472.    1423.   1478.    1430.   1472.    1493.
  1  1  1  5      1440.   1478.    1431.   1484.    1438.   1479.    1489.
  1  1  1  6      1671.   1712.    1662.   1718.    1669.   1712.    1717.
  1  1  1  7      2514.   2563.    2502.   2572.    2511.   2564.    2585.
  1  1  1  8      1539.   1577.    1532.   1581.    1537.   1577.    1576.
  1  1  2  2      1085.   1118.    1069.   1129.    1079.   1121.    1080.
  1  1  2  3       450.    472.     444.    475.     448.    472.     474.
  1  1  2  4       383.    403.     378.    405.     382.    403.     400.
  1  1  2  5       491.    514.     485.    516.     490.    513.     525.
  1  1  2  6       687.    713.     680.    716.     685.    713.     716.
  1  1  2  7      1139.   1170.    1129.   1176.    1136.   1171.    1177.
  1  1  2  8       679.    703.     673.    705.     677.    702.     688.
  1  2  1  2     10301.  10380.   10261.  10413.   10285.  10393.   10385.
  1  2  1  3      3748.   3794.    3737.   3802.    3745.   3795.    3823.
  1  2  1  4      4097.   4144.    4086.   4152.    4094.   4145.    4169.
  1  2  1  5      3282.   3323.    3274.   3328.    3280.   3323.    3351.
  1  2  1  6      4385.   4431.    4375.   4438.    4383.   4432.    4451.
  1  2  1  7      3389.   3429.    3381.   3434.    3387.   3429.    3413.
  1  2  1  8      3034.   3072.    3027.   3075.    3032.   3071.    3075.
  1  2  2  2      1615.   1651.    1596.   1664.    1607.   1656.    1637.
  1  2  2  3       812.    838.     804.    842.     810.    838.     843.
  1  2  2  4       741.    765.     735.    768.     739.    765.     769.
  1  2  2  5       678.    699.     672.    701.     676.    699.     702.
  1  2  2  6      1170.   1197.    1162.   1201.    1167.   1197.    1195.
  1  2  2  7      1244.   1272.    1237.   1276.    1242.   1272.    1273.
  1  2  2  8       999.   1024.     993.   1026.     998.   1023.    1014.
  2  1  1  2      2162.   2214.    2128.   2240.    2146.   2225.    2166.
  2  1  1  3      1321.   1363.    1307.   1373.    1317.   1365.    1372.
  2  1  1  4      1094.   1131.    1084.   1138.    1091.   1132.    1143.
  2  1  1  5      1178.   1215.    1168.   1221.    1175.   1216.    1219.
  2  1  1  6       838.    868.     831.    872.     836.    868.     866.
  2  1  1  7      1899.   1944.    1886.   1953.    1895.   1946.    1964.
  2  1  1  8      1628.   1669.    1618.   1675.    1625.   1670.    1669.
  2  1  2  2       276.    291.     263.    295.     270.    293.     270.
  2  1  2  3       173.    187.     166.    190.     171.    188.     187.
  2  1  2  4       120.    131.     115.    131.     118.    130.     131.
  2  1  2  5       164.    177.     159.    178.     162.    176.     172.
  2  1  2  6       129.    140.     126.    141.     128.    140.     133.
  2  1  2  7       399.    418.     391.    422.     396.    419.     417.
  2  1  2  8       431.    450.     423.    453.     428.    450.     441.
  2  2  1  2      3490.   3528.    3463.   3546.    3477.   3536.    3501.
  2  2  1  3      2412.   2444.    2399.   2451.    2407.   2445.    2228.
  2  2  1  4      2347.   2390.    2333.   2400.    2343.   2392.    2437.
  2  2  1  5      1113.   1141.    1107.   1144.    1112.   1141.    1151.

We see in Table 4 that the methods suggested in the present section
provide prediction intervals which compare very well with the actual
data.
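
The Section 6 procedure can be sketched in Python as below, assuming the pooled ratio estimator (6.1) and the half-widths of (6.4) and (6.5). The function name `entry_limits` is hypothetical. Following the appendix FORTRAN program for t = 1, the back-transformation multiplies by X(1) directly rather than applying the continuity correction of (5.10)-(5.11). The inputs are the two-year, LHS, White cohorts of Table 3: entry period 1 (6,057 after 6 months, 5,631 after 12 months) is used to forecast entry period 2 (3,421 after 6 months).

```python
import math


def entry_limits(prev_6m, prev_12m, x_6m, z=1.96):
    """Prediction limits for X(2) of a cohort with X(1) = x_6m.

    prev_6m and prev_12m list X(1) and X(2) of the earlier entry periods
    of the same contract-education-race group.
    """
    sx = float(sum(prev_6m))
    sy = float(sum(prev_12m))
    theta = sy / sx                                   # ratio estimator (6.1)

    center = x_6m * theta
    half = z * math.sqrt(center * (1.0 - theta))
    cmle = (center - half, center + half)

    y = 2.0 * math.asin(math.sqrt(theta))             # arcsine transform of theta

    def back(v):
        # Back-transformation as in the t = 1 FORTRAN program (no correction terms).
        return math.sin(v / 2.0) ** 2 * x_6m

    tol_half = z * (1.0 / math.sqrt(sx) + 1.0 / math.sqrt(x_6m))
    bayes_half = z * math.sqrt(1.0 / sx + 1.0 / x_6m)
    return {
        "cmle": cmle,
        "tolerance": (back(y - tol_half), back(y + tol_half)),
        "bayes": (back(y - bayes_half), back(y + bayes_half)),
    }


result = entry_limits([6057], [5631], 3421)
for name, (lo, hi) in result.items():
    print(name, round(lo), round(hi))
```

For this cohort the sketch gives approximately (3151, 3210), (3127, 3229) and (3142, 3216) for the CMLE, tolerance and Bayes limits; the 12-month value recorded in Table 3 is 3,146.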
      TFU=TFU+FU
C     ARCSINE STATISTIC (5.3); X IS THE CURRENT SIZE, W THE PREVIOUS SIZE
      P=(X+.5)/(W+1.)
      Q=1.-P
      R=SQRT(P/Q)
      Y=2.*ATAN(R)
C     TOLERANCE LIMITS (5.9)-(5.11)
      SE=SQRT(1./X)+SQRT(1./W)
      CL=Y-Z*SE
      CU=Y+Z*SE
      GL=((SIN(CL/2.))**2)*(X+1.)-.5
      GU=((SIN(CU/2.))**2)*(X+1.)-.5
      TGL=TGL+GL
      TGU=TGU+GU
C     BAYES PREDICTION LIMITS (5.17)-(5.19)
      SH=SQRT(1./X+1./W)
      BL=Y-Z*SH
      BU=Y+Z*SH
      HL=((SIN(BL/2.))**2)*(X+1.)-.5
      HU=((SIN(BU/2.))**2)*(X+1.)-.5
      THL=THL+HL
      THU=THU+HU
      WRITE(66,10) I,J,K,L,FL,FU,GL,GU,HL,HU,FA
   10 FORMAT(4I3,7F7.0)
    9 CONTINUE
    8 CONTINUE
    7 CONTINUE
    6 CONTINUE
      WRITE(66,11) TFL,TFU,TGL,TGU,THL,THU,TFA
   11 FORMAT(12X,7F7.0)
      END

2.  FORTRAN PROGRAM FOR THE DETERMINATION OF PREDICTION LIMITS; PHASE II
    COHORTS, t = 1.

      DIMENSION NX(3,2,2,8),NY(3,2,2,8)
      Z=1.96
      IA=3
      JE=2
      KR=2
      LA=8
C     READ THE 6-MONTH (NX) AND 12-MONTH (NY) COUNTS, TWO ENTRY PERIODS
C     AT A TIME
      DO 1 M=1,4
      MM=M-1
      L=1+MM*2
      LL=L+1
      DO 2 I=1,IA
      READ(50,3) ((NX(I,J,K,L),NY(I,J,K,L),K=1,KR),J=1,JE)
      READ(50,3) ((NX(I,J,K,LL),NY(I,J,K,LL),K=1,KR),J=1,JE)
    3 FORMAT(8I5)
    2 CONTINUE
    1 CONTINUE
      DO 4 I=1,IA
      DO 5 J=1,JE
      DO 6 K=1,KR
      SX=0.
      SY=0.
      DO 7 L=2,LA
      LL=L-1
C     ACCUMULATE THE COUNTS OF ALL PREVIOUS ENTRY PERIODS
      X=NX(I,J,K,LL)
      SX=SX+X
      FX=NX(I,J,K,L)
      Y=NY(I,J,K,LL)
      SY=SY+Y
      FA=NY(I,J,K,L)
C     RATIO ESTIMATOR (6.1) AND CMLE LIMITS
      TET=SY/SX
      D=FX*TET
      V=D*(1.-TET)
      SE=SQRT(V)
      FL=D-Z*SE
      FU=D+Z*SE
C     ARCSINE TRANSFORM (6.2) AND TOLERANCE LIMITS (6.4)
      R=TET/(1.-TET)
      SR=SQRT(R)
      W=2.*ATAN(SR)
      SW=SQRT(1./SX)+SQRT(1./FX)
      CL=W-Z*SW
      CU=W+Z*SW
      GL=((SIN(CL/2.))**2)*FX
      GU=((SIN(CU/2.))**2)*FX
C     BAYES PREDICTION LIMITS (6.5)
      SV=SQRT(1./SX+1./FX)
      BL=W-Z*SV
      BU=W+Z*SV
      HL=((SIN(BL/2.))**2)*FX
      HU=((SIN(BU/2.))**2)*FX
      WRITE(66,10) I,J,K,L,FL,FU,GL,GU,HL,HU,FA
   10 FORMAT(4I3,7F7.0)
    7 CONTINUE
    6 CONTINUE
    5 CONTINUE
    4 CONTINUE
      END